An Assessment of the Kuder-Richardson Formula (20) Reliability Estimate for Moderately Speeded Tests
Abstract
Results obtained by the Kuder-Richardson formula (20) adapted for use with R-KW scoring are compared with three other reliability formulas. Based on parallel tests administered at the same sitting, the KR (20) estimates are compared with alternate-form correlations and with odd-even correlations adjusted by the Spearman-Brown prophecy formula. Comparisons are also made between KR (20) estimates and alternate-form correlations obtained for tests administered after intervals of six to ten months. All the results justify the use of the Kuder-Richardson procedure with tests that show no more than moderate speededness.

NCME Annual Meeting, New Orleans, La., February 28, 1973

For some measurement specialists there continues to be doubt about the appropriateness of the Kuder-Richardson formula (20) and its close relatives for estimating reliability unless all the examinees finish the test. This point of view raises a question of considerable importance because in large-scale testing programs it is frequently impractical to provide sufficient time to satisfy the slowest students. Consequently, assigned time limits are likely to represent a compromise between ideal power-test conditions and conditions that may introduce a moderate factor of speededness. The view that has been generally accepted at Educational Testing Service is that a test may be regarded as essentially unspeeded if at least 80 per cent of the examinees reach the last item and if virtually everyone reaches three-quarters of the items. Some ETS tests do not quite meet both conditions.
Nevertheless, the Kuder-Richardson formulas have been used with a high degree of confidence that they provide good estimates of test reliability. It is the purpose of this paper to present evidence that justifies that confidence. The Scholastic Aptitude Test happens to provide such evidence without any need for special testing. The conclusions to be drawn are properly restricted to test material similar to that of the SAT, although there is every reason to believe that generalizations can be made to other tests with similar speed characteristics.

KR (20) versus Alternate-Form, Same Administration

The analysis sample for each new form of the SAT is selected from the records of candidates who took one of the equating sections. Each equating section is a parallel form of one of the operational sections with respect to content, timing, and number of items. Listed in Table I are data for the two parallel sections, A and B, in thirty SAT forms. Sample sizes range from 370 to 2,000. From the per cents who reached three quarters of the items, it is seen that our first condition for an unspeeded test is approximated for all the verbal sections and that the mathematical sections fail to meet it by about 1 to 4 per cent in general and by as much as 11.6 per cent in one instance. Instead of the per cent reaching the last item, our second condition for an unspeeded test, there has been recorded the number of items reached by less than 80 per cent of the group. These figures, too, suggest more speed in the mathematical scores than in the verbal scores.
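The quantities discussed above can be illustrated with a minimal Python sketch. It is not the paper's procedure: it assumes plain 0/1 (rights-only) item scoring and the textbook KR (20) formula with population variance, whereas the paper uses a version adapted for R-KW scoring whose details are not given in this excerpt. All function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def kr20(responses):
    """Textbook KR-20 for rows of 0/1 item scores (assumed rights-only scoring)."""
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)  # population variance of total scores
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people  # proportion correct
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full test length."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(responses):
    """Correlate odd-item and even-item half scores, then adjust."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd, even))


def speededness(items_reached, n_items):
    """Summarize speed from the number of items each examinee reached.

    Returns (per cent reaching the last item,
             per cent reaching three quarters of the items,
             number of items reached by less than 80 per cent of the group).
    """
    n = len(items_reached)
    pct_last = 100 * sum(r >= n_items for r in items_reached) / n
    pct_three_quarters = 100 * sum(r >= 0.75 * n_items for r in items_reached) / n
    items_below_80 = sum(
        1
        for j in range(1, n_items + 1)
        if sum(r >= j for r in items_reached) / n < 0.80
    )
    return pct_last, pct_three_quarters, items_below_80


# Toy data: five examinees, four items (illustrative only, not SAT data).
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(kr20(toy), odd_even_reliability(toy))
print(speededness([4, 4, 4, 4, 3], n_items=4))
```

An alternate-form estimate of the kind the paper compares against would simply be `pearson` applied to total scores on two parallel sections taken by the same candidates.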